On the Estimation of Missing Data in Incomplete Databases: Autoregressive Bayesian Networks
Abstract
Missing data can be estimated by interpolation, by time series modelling, or by exploiting statistically dependent information. The conditions under which one approach is preferable to the alternatives have not been explored, but they are likely to depend on a trade-off between the signal's autoregressive information, the availability of future observations, stationary behaviour, and the strength of the dependence on concomitant information. This paper takes a first step towards clarifying the dataset characteristics that delimit the realm of application of each technique. In addition, it introduces autoregressive Bayesian networks (AR-BN), a variant of Dynamic Bayesian Networks for completing databases that exploits latent variable relations while still benefitting from the autoregressive information of the variable being filled. With AR-BN, new estimated values are calculated by inference in the dynamic model. Our results show how the interplay between a variable's autoregressive information and its relationship to the other variables in the dataset is critical to selecting the optimal data estimation technique. AR-BN appears to be a good candidate, ensuring consistent performance across scenarios, datasets and error metrics.
Keywords: dynamic probabilistic graphical models; incomplete data series; value estimation; knowledge discovery; autoregressive models.
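As a rough illustration of the first two families of techniques named in the abstract (interpolation and time series modelling), the following minimal Python sketch fills gaps in a univariate series in both ways. It is not the paper's AR-BN method. The function names `fill_linear`, `fit_ar1` and `fill_ar1` are hypothetical, and the choice of a first-order autoregressive model fitted by least squares is an assumption made only for the example.

```python
import numpy as np


def fill_linear(x):
    """Fill NaNs by linear interpolation between observed neighbours.

    Uses observations on both sides of a gap, so it needs future values.
    """
    x = np.asarray(x, dtype=float)
    idx = np.arange(len(x))
    known = ~np.isnan(x)
    return np.interp(idx, idx[known], x[known])


def fit_ar1(x):
    """Least-squares fit of x[t] ~ c + phi * x[t-1] (assumed AR(1) model).

    Only consecutive pairs where both values are observed are used.
    """
    x = np.asarray(x, dtype=float)
    prev, curr = x[:-1], x[1:]
    ok = ~np.isnan(prev) & ~np.isnan(curr)
    design = np.column_stack([np.ones(ok.sum()), prev[ok]])
    (c, phi), *_ = np.linalg.lstsq(design, curr[ok], rcond=None)
    return c, phi


def fill_ar1(x):
    """Fill NaNs by one-step-ahead AR(1) prediction from the previous value.

    Uses past values only; a NaN at position 0 is left as NaN.
    """
    x = np.asarray(x, dtype=float).copy()
    c, phi = fit_ar1(x)
    for t in range(1, len(x)):
        if np.isnan(x[t]) and not np.isnan(x[t - 1]):
            # Earlier filled values feed later predictions within a gap.
            x[t] = c + phi * x[t - 1]
    return x


if __name__ == "__main__":
    series = [1.0, 2.0, np.nan, 4.0, 5.0]
    print(fill_linear(series))
    print(fill_ar1(series))
```

The contrast between the two functions mirrors the trade-off the abstract mentions: interpolation relies on future observations being available, whereas the autoregressive fill relies only on the past and on the series being approximately stationary.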
Related resources
Structure of Wavelet Covariance Matrices and Bayesian Wavelet Estimation of Autoregressive Moving Average Model with Long Memory Parameter’s
In the process of exploring and recognizing statistical communities, the analysis of data obtained from these communities is considered essential. One appropriate method for data analysis is the structural study of the function fitted to these data. Wavelet transformation is one of the most powerful tools in the analysis of these functions, and the structure of wavelet coefficients is very impor...
A BAYESIAN APPROACH TO COMPUTING MISSING REGRESSOR VALUES
In this article, Lindley's measure of average information is used to measure the information contained in incomplete observations on the vector of unknown regression coefficients [9]. This measure of information may be used to compute the missing regressor values.
Learning Bayesian networks from incomplete databases using a novel evolutionary algorithm
This paper proposes a novel method for learning Bayesian networks from incomplete databases in the presence of missing values, which combines an evolutionary algorithm with the traditional Expectation Maximization (EM) algorithm. A data completing procedure is presented for learning and evaluating the candidate networks. Moreover, a strategy is introduced to obtain better initial networks to fa...
Learning Bayesian Networks from Incomplete Databases
Bayesian approaches to learning the graphical structure of Bayesian Belief Networks (BBNs) from databases share the assumption that the database is complete, that is, no entry is reported as unknown. Attempts to relax this assumption involve the use of expensive iterative methods to discriminate among different structures. This paper introduces a deterministic method to learn the graphical s...
Publication date: 2013